Cluster ensembles
Abstract
Cluster ensembles combine multiple clusterings of a set of objects into a single consolidated clustering, often referred to as the consensus solution. Consensus clustering can be used to generate more robust and stable clustering results compared to a single clustering approach, perform distributed computing under privacy or sharing constraints, or reuse existing knowledge. This paper describes a variety of algorithms that have been proposed to address the cluster ensemble problem, organizing them in conceptual categories that bring out the common threads and lessons learnt while simultaneously highlighting unique features of individual approaches.
© 2011 John Wiley & Sons, Inc. WIREs Data Mining Knowl Discov 2011 1 305–315 DOI: 10.1002/widm.32
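As an illustration of what a consensus solution can look like, the following is a minimal sketch of one well-known family of approaches: build a co-association matrix (the fraction of base clusterings that place two objects together) and merge objects that co-occur in more than half of them. It is not taken from the paper itself; the function names `co_association` and `consensus`, the 0.5 threshold, and the toy labelings are assumptions made for this example.

```python
from itertools import combinations


def co_association(labelings):
    """Fraction of base clusterings in which each pair of objects shares a cluster."""
    n = len(labelings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(labelings)
    return [[c / m for c in row] for row in counts]


def consensus(labelings, threshold=0.5):
    """Link objects whose co-association exceeds `threshold` (assumed value)
    and return the connected components as consensus cluster labels."""
    co = co_association(labelings)
    n = len(co)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if co[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]


# Three hypothetical base clusterings of six objects. The first two describe
# the same partition with permuted labels; the third disagrees on one object.
base = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 2, 2],
]
print(consensus(base))
```

Because the co-association matrix only records whether two objects share a cluster, the sketch does not depend on how each base clustering happens to name its clusters.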
Similar resources
Bayesian Cluster Ensembles
Cluster ensembles provide a framework for combining multiple base clusterings of a dataset to generate a stable and robust consensus clustering. There are important variants of the basic cluster ensemble problem, notably including cluster ensembles with missing values, as well as row-distributed or column-distributed cluster ensembles. Existing cluster ensemble algorithms are applicable only to...
Cluster-Based Cumulative Ensembles
In this paper, we propose a cluster-based cumulative representation for cluster ensembles. Cluster labels are mapped to incrementally accumulated clusters, and a matching criterion based on maximum similarity is used. The ensemble method is investigated with bootstrap re-sampling, where the k-means algorithm is used to generate high granularity clusterings. For combining, group average hierarch...
Cluster Ensembles for High Dimensional Clustering: An Empirical Study
This paper studies cluster ensembles for high dimensional data clustering. We examine three different approaches to constructing cluster ensembles. To address high dimensionality, we focus on ensemble construction methods that build on two popular dimension reduction techniques, random projection and principal component analysis (PCA). We present evidence showing that ensembles generated by ran...
A CLUE for CLUster Ensembles
Cluster ensembles are collections of individual solutions to a given clustering problem which are useful or necessary to consider in a wide range of applications. The R package clue provides an extensible computational environment for creating and analyzing cluster ensembles, with basic data structures for representing partitions and hierarchies, and facilities for computing on these, including...
Clustering Ensembles for Categorical Data
Cluster ensembles offer a solution to challenges inherent to clustering arising from its ill-posed nature. In this paper we focus on the design of ensembles for categorical data. Our approach leverages diverse input clusterings discovered in random subspaces. We experimentally demonstrate the efficacy of our technique in combination with the categorical clustering algorithm COOLCAT.
Moderate diversity for better cluster ensembles
Adjusted Rand index is used to measure diversity in cluster ensembles and a diversity measure is subsequently proposed. Although the measure was found to be related to the quality of the ensemble, this relationship appeared to be non-monotonic. In some cases, ensembles which exhibited a moderate level of diversity gave a more accurate clustering. Based on this, a procedure for building a cluste...
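To make the idea of measuring ensemble diversity with the adjusted Rand index concrete, here is a minimal sketch in plain Python. The abstract above is truncated, so the aggregate used here (one minus the mean pairwise index over all ensemble members) is an assumption for illustration and may differ from the measure the authors propose; the function names are likewise hypothetical. The sketch assumes at least two objects and at least two clusterings.

```python
from collections import Counter
from itertools import combinations
from math import comb


def adjusted_rand_index(a, b):
    """Adjusted Rand index between two label lists over the same objects."""
    n = len(a)
    # Pairs of objects placed together in both clusterings, in a, and in b.
    together_both = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    together_a = sum(comb(c, 2) for c in Counter(a).values())
    together_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = together_a * together_b / comb(n, 2)
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        # Degenerate case (e.g. both clusterings are a single cluster).
        return 1.0
    return (together_both - expected) / (maximum - expected)


def ensemble_diversity(labelings):
    """Assumed diversity score: 1 minus the mean pairwise adjusted Rand index."""
    pairs = list(combinations(labelings, 2))
    mean_ari = sum(adjusted_rand_index(a, b) for a, b in pairs) / len(pairs)
    return 1.0 - mean_ari


# Hypothetical ensemble of three clusterings of four objects.
ensemble = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],  # same partition as the first, labels permuted
    [0, 1, 0, 1],  # cuts across the first two
]
print(ensemble_diversity(ensemble))
```

Under this score, an ensemble of identical partitions has diversity 0, and the value grows as members disagree more than chance would predict.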
Journal: Wiley Interdisc. Rew.: Data Mining and Knowledge Discovery
Volume: 1
Pages: 305–315
Publication date: 2011